Abstract

We compare two approaches to embedding joint distributions of random variables recorded under different conditions 
(such as spins of entangled particles for different settings) into the framework of classical, Kolmogorovian probability theory. 
In the contextualization approach each random variable is "automatically" labeled by all conditions under which it is 
recorded, and the random variables across a set of mutually exclusive conditions are probabilistically coupled (i.e., a 
joint distribution is imposed on them). Analysis of all possible probabilistic couplings for a given set of random variables allows one to 
characterize various relations between their separate distributions (such as Bell-type inequalities or quantum-mechanical 
constraints). In the conditionalization approach one considers the conditions under which the random variables are 
recorded as if they were values of another random variable, so that the observed distributions are interpreted as conditional 
ones. This approach is uninformative with respect to relations between the distributions observed under different 
conditions because any set of such distributions is compatible with any distribution assigned to the conditions. 



Citation: Dzhafarov EN, Kujala JV (2014) Embedding Quantum into Classical: Contextualization vs Conditionalization. PLoS ONE 9(3): e92818. doi:10.1371/journal.pone.0092818

Editor: Gerardo Adesso, University of Nottingham, United Kingdom 

Received November 29, 2013; Accepted February 25, 2014; Published March 28, 2014

Copyright: © 2014 Dzhafarov, Kujala. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits
unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: This work was supported by NSF grant SES-1155956 to ED. The funders had no role in study design, data collection and analysis, decision to publish, or
preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.
* E-mail: ehtibar@purdue.edu 



Introduction 

Joint Distributions and Stochastic Unrelatedness 

Many scientific problems, from psychology to quantum 
mechanics, can be presented in terms of random outputs of some 
system recorded under various conditions. According to the 
principle of Contextuality-by-Default [1-4], when applying Kolmogorov's
probability theory (KPT) to such a problem, random
variables recorded under different, mutually incompatible conditions
should be viewed as stochastically unrelated to each other, i.e.,
possessing no joint distribution. They can always be "sewn 
together" as part of their theoretical analysis, but joint distribu- 
tions are then imposed on them rather than derived from their 
identities. In this paper we discuss two possible approaches to the 
foundational issue of "sewing together" stochastically unrelated 
random variables. We call these approaches contextualization and 
conditionalization. The former takes the Contextuality-by-Default 
principle as its departure point and is, in a sense, its straightfor- 
ward extension; in the latter, Contextuality-by-Default is obtained 
as a byproduct. 

To understand why the Contextuality-by-Default principle is 
associated with either of these two approaches, one should first of 
all abandon the naive notion that in KPT any two random 
variables have a joint distribution uniquely determined by their 
definitions. A random variable is a measurable function on a 
probability space, and the notion of a single probability space for 
all possible random variables (or, equivalently, the notion of a 
single random variable of which all other random variables are 
functions) is untenable. It contradicts the commonly used KPT
constructions. (In this discussion we impose no restrictions on the
domain and codomain probability spaces. A random variable
is therefore understood in the broadest possible way, including
random vectors, random functions, random sets, etc. We will
avoid, however, the use of general measure-theoretic formalism.)

One of them is, given any set, to construct a random variable 
whose range of possible values coincides with this set. A probability 
space on which all such random variables were defined would 
have to include a set of cardinality exceeding that of all possible 
sets, an impossibility. 

Another commonly used construction is, given a random 
variable, to introduce another random variable that has a given 
distribution and is stochastically independent of the former. The 
use of this construction contradicts even the notion of a jointly
distributed set of all variables with a particular distribution [2], say,
the set Norm(0,1) of all standard-normally distributed random
variables. Indeed, if all random variables in Norm(0,1) were
jointly distributed, they would all be presentable as functions of
some random variable N, the identity function on the probability
space on which the random variables in Norm(0,1) are defined.
Choose now a standard-normally distributed random variable X
so that it is independent of N. Then it is also independent of any
Y ∈ Norm(0,1). Since X cannot be independent of itself, X cannot
belong to Norm(0,1). At the same time, X must belong to
Norm(0,1) due to its distribution.

Short of imposing on KPT artificial constraints (such as an 
upper limit on cardinality of the random variables' ranges), these 
and similar contradictions can only be dissolved by allowing for 
stochastically unrelated random variables defined on different 
probability spaces (see Ref. [5] for how this can be built into the
basic set-up of probability theory). The principle of Contextuality- 
by-Default eliminates guesswork from deciding which random 
variables are and which are not jointly distributed. Irrespective of 
how one defines a system with random outputs and identifies the 
conditions under which these outputs are recorded, the outputs are 
jointly distributed if they are recorded under one and the same set
of conditions; if they are recorded under different, mutually 
exclusive conditions, they are stochastically unrelated. 

Two Approaches 

Contextualization and conditionalization differ in how they 
"sew together" stochastically unrelated random variables. To 
demonstrate these differences on a simple example, let X and Y
be random variables with +1/−1 values, so that their distributions
are determined by Pr[X = 1] and Pr[Y = 1], respectively. Let
X and Y be recorded under mutually exclusive conditions.

In contextualization (the approach we proposed in Refs. [1-4]),
one first invokes the Contextuality-by-Default principle to treat X
and Y as stochastically unrelated random variables. A "sewing
together" of X and Y consists in probabilistically coupling them [6],
i.e., presenting them as functions of a single random variable. Put
differently (but equivalently), we create a random variable (vector)
$Z = (X', Y')$ such that $X'$ is distributed as X and $Y'$ is distributed
as Y. The variables $X'$ and $Y'$ are jointly distributed (otherwise Z
would not be called a random variable, or a random vector), but
this distribution is not unique. Thus, X and Y can always be
coupled as stochastically independent random variables, so that

$$\Pr[X'=1, Y'=1] = \Pr[X'=1] \times \Pr[Y'=1]. \tag{1}$$

They can also be coupled as identical random variables,

$$\Pr[X'=Y'] = 1, \tag{2}$$

but only if X and Y are distributed identically,

$$\Pr[X=1] = \Pr[Y=1]. \tag{3}$$

There can, in fact, be an infinity of couplings, constrained only by

$$\begin{aligned}
\Pr[X'=1, Y'=1] + \Pr[X'=1, Y'=-1] &= \Pr[X=1],\\
\Pr[X'=1, Y'=1] + \Pr[X'=-1, Y'=1] &= \Pr[Y=1].
\end{aligned} \tag{4}$$

In conditionalization, one creates a random variable C with two
possible values corresponding to the two sets of conditions under
which one records X and Y, respectively. Then one defines a
random variable U = (C, V), such that the conditional distribution
of V given C = 1 is the same as the distribution of X, and the
conditional distribution of V given C = 2 is the same as the
distribution of Y. In other words,

$$\begin{aligned}
\Pr[V=1 \mid C=1] &= \Pr[X=1],\\
\Pr[V=1 \mid C=2] &= \Pr[Y=1].
\end{aligned} \tag{5}$$

The principle of Contextuality-by-Default here does not have to
be invoked explicitly, but it is adhered to anyway: the random
variable V is related to conditions under which it is recorded, and
V conditioned on C = 1 clearly has no joint distribution with V
conditioned on C = 2.

Conditionalization can also be implemented in more complex
constructions, such as the one proposed in Ref. [7]. In our
example, this construction amounts to replacing V with two
random variables, $V_1$ and $V_2$, and "coordinating" their possible
values with the values of C. Thus, one can make $V_1$ and $V_2$
binary, +1/−1, and define the conditional distributions by

$$\begin{aligned}
\Pr[V_1=v, V_2=1 \mid C=1] &= \Pr[V_1=v, V_2=-1 \mid C=1] = \tfrac{1}{2}\Pr[X=v],\\
\Pr[V_1=1, V_2=v \mid C=2] &= \Pr[V_1=-1, V_2=v \mid C=2] = \tfrac{1}{2}\Pr[Y=v],
\end{aligned} \tag{6}$$

where v = 1 or −1. For C = 1, as we see, the "relevant" output is
$V_1$, and the probabilities of its values v are simply evenly divided
between the two possible values of the "irrelevant" output $V_2$ (and
for C = 2, $V_1$ and $V_2$ exchange places).

We argue in this paper that only contextualization serves as a 
useful tool for classifying and characterizing different types of 
systems involving random outputs that depend on conditions (e.g., 
classical-mechanical vs quantum-mechanical systems). Conditio- 
nalization, both in its simplest and modified versions, is always 
applicable but uninformative. 

Quantum Entanglement 

Our analysis pertains to any input-output relations, as 
considered in Refs. [1-3,8-11]. The relations can be physical, 
biological, behavioral, social, etc. For the sake of mathematical 
transparency, however, we confine our consideration to the 
canonical quantum-mechanical paradigm [12] involving two 
entangled particles, "Alice's" and "Bob's." Alice measures the
spin of her particle in one of two directions, $\alpha_1$ or $\alpha_2$ (values of the
first input), and Bob measures the spin of his particle in one of two
directions, $\beta_1$ or $\beta_2$ (values of the second input). Each pair of
measurements is therefore characterized by one of four possible
combinations of input values $(\alpha_i, \beta_j)$, and it is these combinations
that form the four conditions in this example. The spins recorded in
each trial are realizations of random variables (outputs) A and B,
which, in the simplest case of spin-1/2 particles, can attain two
values each: "up" or "down" (encoded by +1 and −1,
respectively).

Aside from simplicity, another good reason for using this 
example is that it relates to the problem of great interest in the 
foundation of physics: in what way and to what extent can one
embed joint probabilities of spins in entangled particles into the 
framework of KPT? It may seem that this question was answered 
by John Bell in his classical papers [13,14], and that the answer 
was: KPT is not compatible with the joint distributions of spins in 
entangled particles. However, in Bell's analysis and its subsequent 
elaborations [15,16] the use of KPT is constrained by an added 
assumption that has nothing to do with KPT. Namely, the implicit 
assumption in these analyses is that of "noncontextuality": 

a spin recorded in Alice's particle is a random variable 
uniquely identified by the measurement setting (spatial axis) 
for which it is recorded (and analogously for Bob's particle). 

In other words, the spins recorded by Alice for settings $\alpha_1$ and $\alpha_2$
are different random variables $A_1$ and $A_2$, but the identity of either
of them does not depend on whether Bob's setting is $\beta_1$ or $\beta_2$ (and
analogously for Bob's random variables $B_1, B_2$ corresponding to $\beta_1$
and $\beta_2$). For well-established reasons (discussed in detail below),
this makes a Kolmogorovian account of quantum entanglement
impossible.

However, according to the Contextuality-by-Default principle, 
if one applies it to the Alice-Bob paradigm, 

any two random variables recorded under mutually 
exclusive conditions are labeled by these conditions and 
considered stochastically unrelated. 

Alice's spin values recorded under the condition $(\alpha_1, \beta_1)$ cannot
co-occur with the spin values recorded by her under the condition
$(\alpha_1, \beta_2)$, even though $\alpha_1$ is the same in both conditions. Therefore
the identity of the spin she measures under $(\alpha_1, \beta_1)$ should be
viewed as different from the identity of the spin she measures
under $(\alpha_1, \beta_2)$.

This leads one to the double-indexation of the spins,

$$A_{11}, A_{12}, A_{21}, A_{22}, B_{11}, B_{12}, B_{21}, B_{22}, \tag{7}$$

where $A_{ij}$ and $B_{ij}$ are the measurements by Alice and Bob,
respectively, recorded under the condition $(\alpha_i, \beta_j)$, $i,j \in \{1,2\}$. This
vector of random variables cannot be called a random vector (or
random variable, as we use the term broadly), because its
components are not jointly distributed. Thus, $A_{11}$ and $A_{12}$, or
$A_{11}$ and $B_{12}$, are recorded under mutually exclusive conditions, so
they do not have jointly observed realizations. But the outputs $A_{11}$
and $B_{11}$, being recorded under one and the same condition $(\alpha_1, \beta_1)$,
are jointly distributed, i.e., the joint probabilities for
different combinations of co-occurring values of $A_{11}$ and $B_{11}$ are
well-defined. The situation is summarized in the following
diagram:

[Diagram (8): the four jointly distributed pairs $(A_{11}, B_{11})$, $(A_{12}, B_{12})$, $(A_{21}, B_{21})$, $(A_{22}, B_{22})$, with no joint distribution defined across different pairs.]

Contextuality and No-Signaling

Why do we speak of "contextuality" and "noncontextuality"?
The terms come from quantum mechanics (see, e.g., Refs. [17-21]),
although it is not always clear that they are used in the same
meaning as in the present paper. In the Alice-Bob paradigm with
two spin-1/2 particles, the (marginal) distribution of Alice's
measurement $A_{ij}$ does not depend on Bob's setting $\beta_j$, nor does
the distribution of Bob's measurement $B_{ij}$ depend on $\alpha_i$:

$$\begin{aligned}
\Pr[A_{11}=1] &= \Pr[A_{12}=1], & \Pr[A_{21}=1] &= \Pr[A_{22}=1],\\
\Pr[B_{11}=1] &= \Pr[B_{21}=1], & \Pr[B_{12}=1] &= \Pr[B_{22}=1].
\end{aligned} \tag{9}$$

This is known as the no-signaling condition [22]: Alice, by
watching outcomes of her measurements, is not able to guess Bob's
settings, and vice versa. If the two particles are separated by a
space-like interval, violations of no-signaling would contravene
special relativity (and imply the "spooky action at a distance," in
Einstein's words).

Nevertheless, in KPT, A cannot be indexed by $\alpha_i$ alone, nor can
B be indexed by $\beta_j$ alone.

The logic forbidding single-indexation of the spins,
$A_1, A_2, B_1, B_2$, is simple [4]. Since, for any $i,j$, the random
variables $A_i$ and $B_j$ are jointly distributed, they are defined on
the same probability space. Applying this consideration to
$(A_1, B_1)$, $(A_1, B_2)$, and $(A_2, B_1)$, we are forced to accept that all
four random variables, $A_1, A_2, B_1, B_2$, are defined on one and the
same probability space. The existence of this joint distribution,
however, is known to be equivalent to Bell-type inequalities (see
below), known not to hold for entangled particles.

Therefore, in perfect compliance with the Contextuality-by-Default
principle, we are forced to use the double indexation (7).
We can say that while $\beta_j$ does not influence $A_{ij}$ "directly" (which
would be the case if $\beta_j$ could affect the distribution of $A_{ij}$), it
generally creates a "context" for $A_{ij}$. The context makes $A_{11}$ and
$A_{12}$ two different random variables with one and the same
distribution, rather than one and the same random variable.
(Analogous reasoning applies to $B_{ij}$ in relation to $\alpha_i$.)

It should not, of course, come as a surprise that different
random variables can be identically distributed. After all, it is
perfectly possible that the distributions of Alice's spins for $\alpha_1$ and
$\alpha_2$ are identical too, and this would not imply that they are one
and the same random variable. Within the framework of KPT, the
difference between $A_{11}$ and $A_{12}$ is essentially the same as the
difference between $A_{11}$ and $A_{21}$: in both cases we deal with
stochastically unrelated random variables, the only difference
being that in the former pair, unlike in the latter one, the
no-signaling condition forces the two random variables to be
identically distributed. The notion of contextuality, however, does
require broadening of one's thinking about how one decides that
some empirical observations are and some are not realizations of
one and the same random variable, as understood in KPT [2,3].

Theory

Contextualization and Couplings

Contextualization is a straightforward extension of the
Contextuality-by-Default principle. The latter creates the eight
random variables in (7), and the contextualization approach
consists in directly imposing a joint distribution on them. This can,
of course, be done in infinitely many ways. Any random variable


$$Y = (A'_{11}, A'_{12}, A'_{21}, A'_{22}, B'_{11}, B'_{12}, B'_{21}, B'_{22}) \tag{10}$$

such that, for any $i = 1,2$ and $j = 1,2$,

$$(A'_{ij}, B'_{ij}) \text{ is distributed as } (A_{ij}, B_{ij}), \tag{11}$$

is called a (probabilistic) coupling for (7) [6]. The fact that Y in (10)
is referred to as a random variable (or random vector) implies that the
components of Y are jointly distributed, i.e., there is a joint
probability assigned to each of the $2^8$ combinations of values for
these components.

The set of all possible couplings (10) for (7) is generally different
for different distributions of the pairs $(A_{ij}, B_{ij})$. However, it always
includes the coupling Y in which the pairs $(A'_{ij}, B'_{ij})$ are
stochastically independent across different $(i,j)$. This coupling is
referred to as an independent coupling. Its universal applicability leads
to the common confusion of stochastic unrelatedness with
stochastic independence. But stochastic independence is merely
a special property of a joint distribution.
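
As an illustration of the independent coupling, the following Python sketch builds it for four given pair distributions and checks property (11) numerically. The function names, the dictionary layout, and the example distributions (uniform marginals with arbitrarily chosen expectations) are assumptions made for this example only; they are not taken from the original analysis.

```python
from itertools import product

VALUES = (1, -1)
CONDITIONS = [(1, 1), (1, 2), (2, 1), (2, 2)]  # index pairs (i, j)


def uniform_pair(e):
    """Hypothetical example distribution of (A_ij, B_ij): both marginals
    are uniform on {+1, -1} and the expected product <A_ij B_ij> equals e."""
    return {(a, b): (1 + a * b * e) / 4 for a in VALUES for b in VALUES}


def independent_coupling(pairs):
    """Joint distribution of (A'_11, B'_11, A'_12, B'_12, A'_21, B'_21,
    A'_22, B'_22) in which the four pairs are stochastically independent."""
    coupling = {}
    for outcome in product(VALUES, repeat=8):
        prob = 1.0
        for k, cond in enumerate(CONDITIONS):
            a, b = outcome[2 * k], outcome[2 * k + 1]
            prob *= pairs[cond][(a, b)]
        coupling[outcome] = prob
    return coupling


def pair_marginal(coupling, cond):
    """Marginal distribution of (A'_ij, B'_ij) within the coupling."""
    k = CONDITIONS.index(cond)
    marginal = {(a, b): 0.0 for a in VALUES for b in VALUES}
    for outcome, prob in coupling.items():
        marginal[(outcome[2 * k], outcome[2 * k + 1])] += prob
    return marginal


example_pairs = {
    cond: uniform_pair(e)
    for cond, e in zip(CONDITIONS, (0.5, -0.2, 0.9, 0.0))
}
example_coupling = independent_coupling(example_pairs)
```

The coupling assigns a probability to each of the 256 value combinations, and each pair marginal reproduces the corresponding observed distribution, which is all that (11) requires. Nothing in the construction depends on how the four pair distributions relate to each other, which is why this coupling is always available.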

The non-uniqueness of the coupling (10), rather than being a
hindrance, can be advantageously used in theoretical analysis.
According to the All-Possible-Couplings principle formulated in Refs.
[2,3],

a set of stochastically unrelated random variables is 
characterized by the set of all possible couplings that can 
be imposed on them, with no couplings being a priori 
privileged. 

Thus, according to Ref. [1], the set of all possible couplings for
(7) can be used to characterize various constraints imposed on the
joint distributions of $(A_{ij}, B_{ij})$ in (8).

From the point of view of all possible couplings, the
noncontextuality assumption leading to the single-indexation of
the spins, $A_i, B_j$, is equivalent to imposing an identity coupling on the
double-indexed outputs in (7), i.e., creating a coupling (10)-(11)
with the additional constraint

$$\begin{aligned}
\Pr[A'_{11} = A'_{12}] &= 1, & \Pr[A'_{21} = A'_{22}] &= 1,\\
\Pr[B'_{11} = B'_{21}] &= 1, & \Pr[B'_{12} = B'_{22}] &= 1.
\end{aligned} \tag{12}$$



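The Bell-type theorems [13-16] tell us that this coupling exists if
and only if both the no-signaling condition is satisfied and the four
observable joint distributions of $(A_{ij}, B_{ij})$ satisfy the inequalities

$$\begin{aligned}
|\langle A_{11}B_{11}\rangle + \langle A_{12}B_{12}\rangle + \langle A_{21}B_{21}\rangle - \langle A_{22}B_{22}\rangle| &\le 2,\\
|\langle A_{11}B_{11}\rangle + \langle A_{12}B_{12}\rangle - \langle A_{21}B_{21}\rangle + \langle A_{22}B_{22}\rangle| &\le 2,\\
|\langle A_{11}B_{11}\rangle - \langle A_{12}B_{12}\rangle + \langle A_{21}B_{21}\rangle + \langle A_{22}B_{22}\rangle| &\le 2,\\
|-\langle A_{11}B_{11}\rangle + \langle A_{12}B_{12}\rangle + \langle A_{21}B_{21}\rangle + \langle A_{22}B_{22}\rangle| &\le 2,
\end{aligned} \tag{13}$$

where $\langle\cdots\rangle$ denotes expected value. Clearly, these inequalities do
not have to be satisfied, and, in the Alice-Bob paradigm, for some
quadruples of settings $(\alpha_1, \alpha_2, \beta_1, \beta_2)$, these inequalities are
contravened by quantum theory and experimental data.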
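
A small numerical sketch may help to see such a violation. It evaluates the four expressions in (13) for the quantum prediction in the dot-product form (39), assuming coplanar measurement axes so that the dot product reduces to a cosine of an angle difference. The function names and the particular angles are illustrative choices, not taken from the paper.

```python
import math


def bell_expressions(e11, e12, e21, e22):
    """The four signed sums of expectations <A_ij B_ij> appearing in (13)."""
    return (
        e11 + e12 + e21 - e22,
        e11 + e12 - e21 + e22,
        e11 - e12 + e21 + e22,
        -e11 + e12 + e21 + e22,
    )


def satisfies_bell(e11, e12, e21, e22, tol=1e-12):
    """True if all four inequalities in (13) hold (up to a small tolerance)."""
    return all(abs(s) <= 2 + tol for s in bell_expressions(e11, e12, e21, e22))


def quantum_correlation(alpha, beta):
    """Expectation <AB> = -<alpha|beta> for unit vectors lying in one plane,
    given by their angles (in radians); an assumption of this sketch."""
    return -math.cos(alpha - beta)


# Hypothetical settings: Alice's axes at 0 and 90 degrees,
# Bob's axes at +45 and -45 degrees.
alphas = (0.0, math.pi / 2)
betas = (math.pi / 4, -math.pi / 4)
quantum_e = [quantum_correlation(a, b) for a in alphas for b in betas]

largest = max(abs(s) for s in bell_expressions(*quantum_e))
print(largest, satisfies_bell(*quantum_e))
```

With these settings the largest of the four absolute values is 2√2, so the identity coupling (12) cannot exist for the corresponding pair distributions. By contrast, four equal expectations of 0.5 give the value 1 in every expression and satisfy (13).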

Therefore, we have to use double-indexing and consider
couplings other than the identity coupling (12). This is the essence
of the contextualization approach, when applied to the Alice-Bob
paradigm. In the conditionalization approach, discussed next, one
also uses what can be thought of as a version of double-indexation
(conditioning on the two indices viewed as values of a random
variable), but instead of the couplings in the sense of (10)-(11) one
uses a different theoretical construct, conditional couplings.

Conditionalization and Conditional Couplings 

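One of the simplest ways of creating stochastically unrelated
random variables is to consider a tree of possibilities, like this one:

[Diagram (14): a two-stage tree. The first stage yields outcome a or b; the second stage yields c or d, with probabilities p and 1 − p following a, and q and 1 − q following b.]

We have at the first stage outcomes a and b, and depending on
which of them is realized, the choice between c and d occurs with
generally different probabilities. We can consider a and b as two
mutually exclusive conditions, and use them to label the two
random variables

$$X_a = \begin{cases} c & \text{with probability } p,\\ d & \text{with probability } 1-p, \end{cases}
\qquad
X_b = \begin{cases} c & \text{with probability } q,\\ d & \text{with probability } 1-q. \end{cases} \tag{15}$$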



Clearly, $X_a$ and $X_b$ here do not have a joint distribution: e.g., no
joint probability $\Pr[X_a = c, X_b = c]$ is defined because there is no
commonly acceptable meaning in which $X_a = c$ may "co-occur"
with $X_b = c$. The two random variables here are stochastically
unrelated, in conformance with the Contextuality-by-Default
principle.

The All-Possible-Couplings principle leads us to consider all
joint distributions

$$\begin{array}{c|cc}
 & X'_b = c & X'_b = d\\ \hline
X'_a = c & r & p-r\\
X'_a = d & q-r & 1-p-q+r
\end{array} \tag{16}$$

with

$$\max(0, p+q-1) \le r \le \min(p, q). \tag{17}$$

Each r within this range defines a possible coupling
$Y = (X'_a, X'_b)$ for $X_a$ and $X_b$. In particular, the independent
coupling, with r = pq, is within the range, while the identity
coupling, with $\Pr[X'_a = X'_b] = 1$, is possible if and only if r = p = q.
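
The one-parameter family (16)-(17) is simple enough to tabulate directly. The following Python sketch is only an illustration, with hypothetical function names and example values of p and q: it returns the admissible range of r, builds the table (16) for a chosen r, and computes the largest attainable probability that the two coupled variables coincide.

```python
def coupling_range(p, q):
    """Admissible interval for r = Pr[X'_a = c, X'_b = c], as in (17)."""
    return max(0.0, p + q - 1.0), min(p, q)


def coupling_table(p, q, r, tol=1e-12):
    """Joint distribution (16) of (X'_a, X'_b); keys are (value of X'_a, value of X'_b)."""
    low, high = coupling_range(p, q)
    if not (low - tol <= r <= high + tol):
        raise ValueError("r lies outside the range (17)")
    return {
        ("c", "c"): r,
        ("c", "d"): p - r,
        ("d", "c"): q - r,
        ("d", "d"): 1.0 - p - q + r,
    }


def max_agreement(p, q):
    """Largest value of Pr[X'_a = X'_b] over all couplings,
    attained at the upper end of the range (17)."""
    r = min(p, q)
    return r + (1.0 - p - q + r)


p_example, q_example = 0.7, 0.4          # arbitrary illustrative values
independent = coupling_table(p_example, q_example, p_example * q_example)
```

For p = 0.7 and q = 0.4 the independent coupling (r = 0.28) lies inside the range, while the largest attainable agreement probability is 0.7, so no identity coupling exists. When p = q the agreement probability reaches 1, in line with the condition r = p = q.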

There is, however, a more traditional view of $X_a$ and $X_b$ in (14).
It consists in considering a joint distribution of two random
variables, C and X, with the marginal distributions

$$C = \begin{cases} a & \text{with probability } \pi,\\ b & \text{with probability } 1-\pi, \end{cases}
\qquad
X = \begin{cases} c & \text{with probability } p\pi + q(1-\pi),\\ d & \text{with probability } (1-p)\pi + (1-q)(1-\pi), \end{cases} \tag{18}$$

and with the joint distribution

$$\begin{array}{c|cc}
 & X = c & X = d\\ \hline
C = a & p\pi & (1-p)\pi\\
C = b & q(1-\pi) & (1-q)(1-\pi)
\end{array} \tag{19}$$

$X_a$ is then interpreted as X given C = a, and analogously for $X_b$.
The conditional probabilities are computed as required,

$$p = \Pr[X = c \mid C = a], \qquad q = \Pr[X = c \mid C = b]. \tag{20}$$

The idea suggested by this simple exercise is this:

consider any set of stochastically unrelated random outputs
labeled by mutually exclusive conditions as if these
conditions were values of some random variable, and the
outputs were values of another random variable conditioned
upon the values of the former.

We call this approach conditionalization. It may seem to
provide a simple alternative, within the framework of KPT, to
considering all couplings imposable on stochastically unrelated
variables. We will argue, however, that this alternative is not
theoretically interesting.

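Consider the conditionalization of our Alice-Bob paradigm.
Denote, for $i = 1,2$ and $j = 1,2$,

$$p_{ij} = \Pr[A_{ij}=1, B_{ij}=1], \qquad p_{i\cdot} = \Pr[A_{ij}=1], \qquad p_{\cdot j} = \Pr[B_{ij}=1]. \tag{21}$$

Introduce a random variable C with four values

$$c_{ij} = (\alpha_i, \beta_j), \qquad i,j \in \{1,2\},$$

and a random variable $X = (A', B')$ with four values

$$(1,1),\ (1,-1),\ (-1,1),\ (-1,-1).$$

Form the tree of outcomes as shown below, using arbitrarily
chosen positive probabilities $\pi_{11}, \pi_{12}, \pi_{21}, \pi_{22}$ (summing to 1):

[Diagram: a two-stage tree in which C takes the value $(\alpha_i, \beta_j)$ with probability $\pi_{ij}$, and each such branch leads to one of the four values (1,1), (1,−1), (−1,1), (−1,−1) of $(A', B')$.]

The conditionalization is completed by computing the joint
distribution of C and $(A', B')$:

$$\begin{array}{c|cccc}
 & (A',B') = (1,1) & (1,-1) & (-1,1) & (-1,-1)\\ \hline
C = (\alpha_i, \beta_j) & p_{ij}\pi_{ij} & (p_{i\cdot} - p_{ij})\pi_{ij} & (p_{\cdot j} - p_{ij})\pi_{ij} & (1 - p_{i\cdot} - p_{\cdot j} + p_{ij})\pi_{ij}
\end{array} \tag{23}$$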



Clearly, we have constructed a random variable

$$Z = (C, (A', B')) \tag{24}$$

such that

$$(A', B') \text{ given } C = (\alpha_i, \beta_j) \text{ is distributed as } (A_{ij}, B_{ij}). \tag{25}$$

This Z can be called a conditional coupling for $(A_{ij}, B_{ij})$,
$i,j \in \{1,2\}$.

The conditionalization procedure does not have to claim the 
existence of any "true" or unique distribution of C. One can freely 
concoct this distribution, even if the conditions under which A and 
B are measured are chosen at will or according to a deterministic 
algorithm. 
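
The freedom in choosing the distribution of C can be checked numerically. The Python sketch below is an illustration under stated assumptions: the function names, the example pair distributions, and the two candidate distributions of C are all invented for the example. It builds the conditional coupling (24) and verifies (25) for two quite different choices of the probabilities assigned to the four conditions.

```python
def conditional_coupling(pairs, pi):
    """Joint distribution of Z = (C, (A', B')): each entry is
    Pr[C = (i, j)] * Pr[A_ij = a, B_ij = b], as in table (23)."""
    return {
        (cond, ab): pi[cond] * prob
        for cond, dist in pairs.items()
        for ab, prob in dist.items()
    }


def conditional_given(z, cond):
    """Distribution of (A', B') given C = cond, computed from Z."""
    total = sum(prob for (c, _), prob in z.items() if c == cond)
    return {ab: prob / total for (c, ab), prob in z.items() if c == cond}


# Hypothetical pair distributions; the first two deliberately give Alice
# different marginals (0.6 versus 0.2), i.e. they violate no-signaling.
quarter = {(1, 1): 0.25, (1, -1): 0.25, (-1, 1): 0.25, (-1, -1): 0.25}
pairs = {
    (1, 1): {(1, 1): 0.5, (1, -1): 0.1, (-1, 1): 0.1, (-1, -1): 0.3},
    (1, 2): {(1, 1): 0.1, (1, -1): 0.1, (-1, 1): 0.4, (-1, -1): 0.4},
    (2, 1): dict(quarter),
    (2, 2): dict(quarter),
}

# Two arbitrary positive distributions of C over the four conditions.
pi_uniform = {cond: 0.25 for cond in pairs}
pi_skewed = {(1, 1): 0.7, (1, 2): 0.1, (2, 1): 0.1, (2, 2): 0.1}

z_uniform = conditional_coupling(pairs, pi_uniform)
z_skewed = conditional_coupling(pairs, pi_skewed)
```

Both choices of the distribution of C reproduce every pair distribution as a conditional one, including pair distributions that violate no-signaling. The construction therefore places no constraint on how the four pair distributions relate to each other.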

There are two interesting modifications of conditionalization,
both proposed in a recent paper by Avis, Fischer, Hilbert, and
Khrennikov [7]. Instead of the conditional coupling Z in (24), they
consider

$$Z' = (C, (A'_1, A'_2, B'_1, B'_2)) \tag{26}$$

such that

$$(A'_i, B'_j) \text{ given } C = (\alpha_i, \beta_j) \text{ is distributed as } (A_{ij}, B_{ij}). \tag{27}$$

In other words,

$$\Pr[A'_i = \pm 1, B'_j = \pm 1 \mid C = (\alpha_i, \beta_j)] = \Pr[A_{ij} = \pm 1, B_{ij} = \pm 1]. \tag{28}$$

This does not yet define the conditional probabilities for all
possible values of $(A'_1, A'_2, B'_1, B'_2)$. Avis et al. describe two ways of
defining them.

In one of them $A'_1, A'_2, B'_1, B'_2$ have two possible values each, ±1,
and

$$\Pr[A'_i = a, B'_j = b, A'_{3-i} = a', B'_{3-j} = b' \mid C = (\alpha_i, \beta_j)] = \tfrac{1}{4}\Pr[A_{ij} = a, B_{ij} = b]. \tag{29}$$

That is, the probability of $(A'_i = a, B'_j = b)$ at $C = (\alpha_i, \beta_j)$ is
evenly partitioned among the four values of the "irrelevant" pair
$(A'_{3-i}, B'_{3-j})$. It is easy to see that one could as well use any other
partitioning:

$$\Pr[A'_i = a, B'_j = b, A'_{3-i} = a', B'_{3-j} = b' \mid C = (\alpha_i, \beta_j)] = t_{ij}(a', b')\Pr[A_{ij} = a, B_{ij} = b], \tag{30}$$

with nonnegative $t_{ij}(a', b')$ subject to

$$t_{ij}(1,1) + t_{ij}(1,-1) + t_{ij}(-1,1) + t_{ij}(-1,-1) = 1, \qquad i,j \in \{1,2\}.$$

Now, for any distribution of C with non-zero values of
$\Pr[C = (\alpha_i, \beta_j)]$, the joint distribution of $(C, A'_1, A'_2, B'_1, B'_2)$ is
well-defined.

Another way of implementing (28) described in Ref. [7] is to
allow each of $A'_1, A'_2, B'_1, B'_2$ to attain a third value, say, 0, in
addition to ±1. This third value can be interpreted as "is not
defined." It is postulated then that

$$\Pr[A'_i = a, B'_j = b, A'_{3-i} = a', B'_{3-j} = b' \mid C = (\alpha_i, \beta_j)] =
\begin{cases}
\Pr[A_{ij} = a, B_{ij} = b] & \text{if } a \neq 0,\ b \neq 0,\ a' = b' = 0,\\
0 & \text{otherwise.}
\end{cases} \tag{31}$$

Again, it is easy to see that the joint distribution of
$(C, A'_1, A'_2, B'_1, B'_2)$ is well-defined and satisfies (28) for any
distribution of C with non-zero values of $\Pr[C = (\alpha_i, \beta_j)]$.
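
Both modifications can be written out in a few lines. The Python sketch below is a hedged illustration of the even partition (29) and of the third-value construction (31) for a single condition; the function names and the example pair distribution are assumptions of this example, not code from Ref. [7].

```python
from itertools import product


def even_partition(dist, i, j):
    """Conditional distribution of (A'_1, A'_2, B'_1, B'_2) given
    C = (alpha_i, beta_j), following (29): the probability of the relevant
    pair is split evenly over the four values of the irrelevant pair."""
    out = {}
    for a1, a2, b1, b2 in product((1, -1), repeat=4):
        a = a1 if i == 1 else a2
        b = b1 if j == 1 else b2
        out[(a1, a2, b1, b2)] = dist[(a, b)] / 4
    return out


def third_value(dist, i, j):
    """Conditional distribution following (31): the irrelevant outputs take
    the value 0 ("is not defined"). Combinations not listed have probability 0."""
    out = {}
    for (a, b), prob in dist.items():
        a1, a2 = (a, 0) if i == 1 else (0, a)
        b1, b2 = (b, 0) if j == 1 else (0, b)
        out[(a1, a2, b1, b2)] = prob
    return out


def relevant_marginal(cond_dist, i, j):
    """Marginal distribution of the relevant pair (A'_i, B'_j)."""
    marginal = {}
    for (a1, a2, b1, b2), prob in cond_dist.items():
        key = (a1 if i == 1 else a2, b1 if j == 1 else b2)
        marginal[key] = marginal.get(key, 0.0) + prob
    return marginal


# Hypothetical distribution of (A_12, B_12), used for the condition (alpha_1, beta_2).
example_dist = {(1, 1): 0.4, (1, -1): 0.1, (-1, 1): 0.2, (-1, -1): 0.3}
by_partition = even_partition(example_dist, 1, 2)
by_third_value = third_value(example_dist, 1, 2)
```

In both constructions the marginal of the relevant pair coincides with the observed pair distribution, as (28) requires. As in the simpler version, the distribution of C never enters this check.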

Discussion 

Comparing the Two Approaches 

Conditionalization and contextualization achieve the same goal 
— "sewing together" stochastically unrelated random variables 
within the confines of KPT. But the similarity ends there. 
Consider, e.g., the Alice-Bob experiment in which both Alice
and Bob use some random generators to choose between two
possible measurement directions. Clearly then C is objectively a
random variable, and a joint distribution of (A, B) and C
objectively exists. Put differently, in this case $(A', B')$ given
$C = (\alpha_i, \beta_j)$ in (25) is simply equal to $(A_{ij}, B_{ij})$.

However, whether C is objectively a random variable or a
distribution for the settings is invented, the quantum-mechanical
analysis of the situation begins with computing the (conditional)
distributions of (A, B) at different settings. The distribution of C in
no way advances our understanding of how $(A_{ij}, B_{ij})$ for different
$(i,j)$ are related to each other.

Thus, we know that the entangled spin-1/2 particles are subject
to Tsirelson's inequalities [24]

$$\begin{aligned}
|\langle A_{11}B_{11}\rangle + \langle A_{12}B_{12}\rangle + \langle A_{21}B_{21}\rangle - \langle A_{22}B_{22}\rangle| &\le 2\sqrt{2},\\
|\langle A_{11}B_{11}\rangle + \langle A_{12}B_{12}\rangle - \langle A_{21}B_{21}\rangle + \langle A_{22}B_{22}\rangle| &\le 2\sqrt{2},\\
|\langle A_{11}B_{11}\rangle - \langle A_{12}B_{12}\rangle + \langle A_{21}B_{21}\rangle + \langle A_{22}B_{22}\rangle| &\le 2\sqrt{2},\\
|-\langle A_{11}B_{11}\rangle + \langle A_{12}B_{12}\rangle + \langle A_{21}B_{21}\rangle + \langle A_{22}B_{22}\rangle| &\le 2\sqrt{2}.
\end{aligned} \tag{32}$$

We also know that if the two particles were not entangled, they
would be subject to the Bell-CH-Fine inequalities (13). The
difference between these two constraints is not reflected in the
"true" distribution of C, if it exists, nor is it implied by or can in
any way restrict the possible choices of "imaginary" distributions
of C. In fact, the only restriction imposed on the distribution of C,
a universal one, is that none of the conditions should have
probability zero, because this would make the conditional
probabilities undefined. Moreover, the set of possible conditional
couplings is the same whether the no-signaling condition is or is
not satisfied.

Although in this discussion we assumed that conditionalization
was implemented in its simplest version, (24)-(25), our arguments
and conclusions apply verbatim to the modifications proposed in
Ref. [7] and described at the end of the previous section. The
conditional distributions of $(A'_1, A'_2, B'_1, B'_2)$ given values of C in
(29) and (31) are uniquely determined by the observed distributions
of the four pairs $(A_{ij}, B_{ij})$. But whatever these distributions,
they can be paired with any distribution of C, provided none of its
values has zero probability.

All of this stands in a clear contrast to the analysis of all possible
couplings (10) in the contextualization approach [1-4]. In this
approach we can ask various questions about the compatibility of
couplings with various constraints known to hold for the
observable joint distributions. Thus, we may ask about the fitting
set of couplings for a given constraint (say, Bell or Tsirelson
inequalities), i.e., the couplings that are compatible with the spin
distributions subject to the constraint. We can also ask about the
forcing set of couplings, those compatible only with the spin
distributions subject to a given constraint. Or we can conjoin the
two questions and ask about the equivalent set of couplings, those
compatible with and only with the spin distributions subject to the
constraint. The answers to such questions will be different for
different constraints being considered.

Since the four observed joint distributions of $(A'_{ij}, B'_{ij})$ in (11)
are themselves part of the couplings (10), the questions above are
only interesting if they are formulated in terms of the unobservable
parts of the couplings. In the examples below we characterize the
couplings in terms of the connections [1,2,4], which are the
(unobservable) pairs

$$(A'_{11}, A'_{12}),\ (A'_{21}, A'_{22}),\ (B'_{11}, B'_{21}),\ (B'_{12}, B'_{22}). \tag{33}$$

The diagram below shows the connections in their relation to
the pairs whose joint distributions are known from observations
(compare with diagram (8)):

[Diagram (34): the four observed pairs of diagram (8), with the connections (33) linking $A'_{11}$ to $A'_{12}$, $A'_{21}$ to $A'_{22}$, $B'_{11}$ to $B'_{21}$, and $B'_{12}$ to $B'_{22}$.]


Let us assume that the probability of spin-up (+1) outcome for
every (spin-1/2) particle in the Alice-Bob paradigm is 1/2. (As shown
in Ref. [23], this can always be achieved by a simple procedural
modification of the canonical Alice-Bob experiment.) This
assumption is, of course, in compliance with the no-signaling
condition, which therefore can be omitted from all formulations
below.

We know [1] that the following two statements about
connections are equivalent:

$(S_1)$ a vector of connections (33) is compatible with and only
with those distributions of $(A_{ij}, B_{ij})$, $i,j \in \{1,2\}$, that satisfy the
Bell-CH-Fine inequalities (13);

is equivalent to

$(S'_1)$ a vector of connections (33) is such that

$$\begin{aligned}
\langle A'_{11}A'_{12}\rangle = \pm 1,\quad \langle A'_{21}A'_{22}\rangle = \pm 1,\\
\langle B'_{11}B'_{21}\rangle = \pm 1,\quad \langle B'_{12}B'_{22}\rangle = \pm 1,
\end{aligned} \tag{35}$$

where the number of + signs among the four expected values
is 4, 2, or 0.

The equivalence of these two statements is an expanded version
of Fine's theorem [16], whose formulation in the language of
connections is: the identity connections, those with

$$\langle A'_{11}A'_{12}\rangle = \langle A'_{21}A'_{22}\rangle = \langle B'_{11}B'_{21}\rangle = \langle B'_{12}B'_{22}\rangle = 1, \tag{36}$$

are only compatible with distributions of $(A_{ij}, B_{ij})$ satisfying the
Bell-CH-Fine inequalities; and if these inequalities hold, then
$(A_{ij}, B_{ij})$ can be coupled by means of the identity connections.

We also know [1] that the following two statements about
connections are equivalent:

$(S_2)$ a vector of connections (33) is compatible with and only
with those distributions of $(A_{ij}, B_{ij})$, $i,j \in \{1,2\}$, that satisfy
the Tsirelson inequalities (32);

is equivalent to

$(S'_2)$ a vector of connections (33) is such that

$$\max\{\pm\langle A'_{11}A'_{12}\rangle \pm \langle A'_{21}A'_{22}\rangle \pm \langle B'_{11}B'_{21}\rangle \pm \langle B'_{12}B'_{22}\rangle : \text{number of +'s is even}\} = 2(3 - \sqrt{2}) \tag{37}$$

and

$$\max\{\pm\langle A'_{11}A'_{12}\rangle \pm \langle A'_{21}A'_{22}\rangle \pm \langle B'_{11}B'_{21}\rangle \pm \langle B'_{12}B'_{22}\rangle : \text{number of +'s is odd}\} \le 2. \tag{38}$$

We see that although the expectations $\langle A'_{i1}A'_{i2}\rangle$ and $\langle B'_{1j}B'_{2j}\rangle$ for
the connections are not observable, they provide a theoretically
meaningful way of characterizing the way in which the stochastically
unrelated and observable $(A_{ij}, B_{ij})$ are being "sewn together." And
these ways are different for the Bell-CH-Fine and Tsirelson
inequalities.

What can contextualization tell us about the basic predictions of
the quantum theory for the Alice-Bob experiment? The theory
tells us that, for $i = 1,2$ and $j = 1,2$,

$$\langle A_{ij}B_{ij}\rangle = -\langle \alpha_i | \beta_j \rangle, \tag{39}$$

where $\langle \alpha_i | \beta_j \rangle$ is the dot product of two unit vectors. It can be
shown [25-27] that the four expectations $\langle A_{ij}B_{ij}\rangle$ can be
presented in the form (39) using a quadruple of settings
$(\alpha_1, \alpha_2, \beta_1, \beta_2)$ if and only if

$$\begin{aligned}
|\arcsin\langle A_{11}B_{11}\rangle + \arcsin\langle A_{12}B_{12}\rangle + \arcsin\langle A_{21}B_{21}\rangle - \arcsin\langle A_{22}B_{22}\rangle| &\le \pi,\\
|\arcsin\langle A_{11}B_{11}\rangle + \arcsin\langle A_{12}B_{12}\rangle - \arcsin\langle A_{21}B_{21}\rangle + \arcsin\langle A_{22}B_{22}\rangle| &\le \pi,\\
|\arcsin\langle A_{11}B_{11}\rangle - \arcsin\langle A_{12}B_{12}\rangle + \arcsin\langle A_{21}B_{21}\rangle + \arcsin\langle A_{22}B_{22}\rangle| &\le \pi,\\
|-\arcsin\langle A_{11}B_{11}\rangle + \arcsin\langle A_{12}B_{12}\rangle + \arcsin\langle A_{21}B_{21}\rangle + \arcsin\langle A_{22}B_{22}\rangle| &\le \pi.
\end{aligned} \tag{40}$$

These inequalities are "sandwiched" between the Bell-CH-Fine
ones and Tsirelson ones. That is, they are implied by the former
and imply the latter. It is shown in Ref. [4] that

$(S_3)$ there is no vector of connections (33) that is compatible
with and only with those distributions of $(A_{ij}, B_{ij})$, $i,j \in \{1,2\}$,
that satisfy the quantum inequalities (40).

Moreover, this negative statement still holds if one replaces the
connections (33) with any other subsets of (10), e.g.,

$$(A'_{11}, A'_{12}, A'_{21}, A'_{22}),\ (B'_{11}, B'_{12}, B'_{21}, B'_{22}). \tag{41}$$

No distributions of such subsets are compatible with and only
with those distributions of $(A_{ij}, B_{ij})$ that satisfy the quantum
inequalities (40).

The investigation of the forcing set of couplings provides
additional insights into the special nature of quantum mechanics.
The result we have [4] says that the following two statements
about connections are equivalent (note the change from "with and
only with" of the previous statements to "only with"):

$(S_4)$ a vector of connections (33) is compatible only with those
distributions of $(A_{ij}, B_{ij})$, $i,j \in \{1,2\}$, that satisfy the quantum
inequalities (40);

is equivalent to

$(S'_4)$ a vector of connections (33) is compatible only with those
distributions of $(A_{ij}, B_{ij})$, $i,j \in \{1,2\}$, that satisfy the Bell-CH-Fine
inequalities (13).

In other words, a choice of connections can force all $(A_{ij}, B_{ij})$
compatible with them to comply with quantum mechanics only in
the form of their compliance with classical mechanics.
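
The nesting of the three constraints discussed in this section, the Bell-CH-Fine inequalities (13), the arcsine ("quantum") inequalities (40), and the Tsirelson inequalities (32), can be illustrated numerically. The following Python sketch is only an illustration: the function names, the tolerance, and the three example quadruples of expectations are chosen for this example, under the section's assumption of uniform marginals.

```python
import math


def signed_sums(x11, x12, x21, x22):
    """The four sums in which exactly one of the four terms enters with a minus sign."""
    return (
        x11 + x12 + x21 - x22,
        x11 + x12 - x21 + x22,
        x11 - x12 + x21 + x22,
        -x11 + x12 + x21 + x22,
    )


def classify(expectations, tol=1e-9):
    """Check a quadruple (<A11 B11>, <A12 B12>, <A21 B21>, <A22 B22>)
    against (13), (40), and (32); tol absorbs floating-point rounding."""
    arcsines = [math.asin(e) for e in expectations]
    return {
        "bell": all(abs(s) <= 2 + tol for s in signed_sums(*expectations)),
        "quantum": all(abs(s) <= math.pi + tol for s in signed_sums(*arcsines)),
        "tsirelson": all(
            abs(s) <= 2 * math.sqrt(2) + tol for s in signed_sums(*expectations)
        ),
    }


s = 1 / math.sqrt(2)
examples = {
    "classical": (0.5, 0.5, 0.5, 0.5),
    "maximal quantum": (s, s, s, -s),
    "beyond quantum": (1.0, 1.0, 0.4, -0.4),
}
results = {name: classify(e) for name, e in examples.items()}
```

The first quadruple satisfies all three constraints. The second violates (13) but sits on the boundary of both (40) and (32). The third satisfies (32) while violating (40), which shows that the arcsine inequalities are strictly stronger than Tsirelson's.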

Conclusion 

The examples just given should suffice to illustrate the point
made: while both contextualization and conditionalization embed
any input-output relation into the framework of KPT, only
contextualization provides a useful tool for understanding the
nature of various constraints imposed on the observable joint
distributions (one could say also, for different types and levels of
contextuality). Conditionalization is uninformative, as any distribution
of the conditions is compatible with any distributions of the
conditional random variables.